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Abstract 

A family of secant methods based on general rank-1 updates has 
been revisited in view of the construction of iterative solvers for 
large non-Hermitian linear systems. As it turns out, both Broy- 
den’s “good” and “bad” update techniques play a special role — but 
should be associated with two different line search principles. For 
Broyden’s “bad” update technique, a minimum residual principle 
is natural — thus making it theoretically comparable with a series 
of well-known algorithms like GMRES. Broyden’s “good” update 
technique, however, is shown to be naturally linked with a mini- 
mum “next correction” principle — which asymptotically mimics 
a minimum error principle. The two minimization principles differ 
significantly for sufficiently large system dimension. Numerical ex- 
periments on discretized PDE’s of convection diffusion type in 2-D 
with internal layers give a first impression of the possible power of 
the derived “good” Broyden variant. 


Key Words: nonsymmetric linear system, secant method, rank-1 

update, Broyden’s method, line search, GMRES. 
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1. Introduction 

The solution of large sparse systems of linear equations 


Ax — b (1-1) 

is one of the most frequently encountered tasks in numerical computations. 
In particular, such systems arise from finite difference or finite element ap- 
proximations to partial differential equations (PDEs). For Hermitian positive 
definite coefficient matrices A , the classical conjugate gradient method (CG) of 
HESTENES/STIEFEL [11] is one of the most powerful iterative techniques for 
solving (1.1). 

In recent years, a number of CG type methods for solving general non-Hermitian 
linear systems (1.1) have been proposed. The most widely used of these algo- 
rithms is GMRES due to SAAD/SCHULTZ [13]. However, solving non-Hermitian 
linear systems is, in general, by fan more difficult than the case of Hermitian 
A, and the situation is still not very satisfactory. For instance, this is reflected 
in the fact that for methods such as GMRES work and storage per iteration 
grow linearly with the iteration number k. Consequently, in practice, one can 
not afford to run the full algorithm and restarted or truncated versions are 
used instead. Notice that, on the contrary, CG for Hermitian A is based on a 
three-term recursion and thus work and storage per iteration remain constant. 

Non-Hermitian linear systems (1.1) are special cases of systems of nonlinear 
equations. For sufficiently good initial guesses, secant methods (see e.g. Den- 
NIS /SCHNABEL [3]) based on Broyden’s rank-1 updates are known to be quite 
efficient techniques for solving these more general problems. However, up to 
now, se can t methods for solving linear systems have had a bad reputation. 

The purpose of this paper is to take an unusual look at secant methods for 
non-Hermitian linear systems (1.1). In particular, as will be shown, combining 
Broyden’s good and bad updates with different line search principles leads to 
iterative schemes which are competitive with GMRES. More than that, these 
secant methods typically exhibit a better reduction of the Euclidean error than 
GMRES. This is of particular importance for solving linear systems which arise 
in the context of multilevel discretizations of PDEs. There, linear systems are 
only solved to am accuracy corresponding to the discretization error on the 
respective level. In order to obtain such approximate solutions with as few iter- 
ations as possible, reduction of the Euclidean error is typically more crucial than 
minimizing the residual norm as GMRES does. For a description of such mul- 
tilevel techniques, see the recent paper of DEUFLHARD/LEINEN/YSERENTANT 

[ 5 ]- 
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It is well known (see e.g. FLETCHER [7, Chapter 3]) that CG for Hermitian 
positive definite A is intimately connected with minimization algorithms based 
on Broyden’s family of rank-2 updates. In view of this result, the similar 
behavior of GMRES and secant methods based on rank-1 updates might not 
come as a surprise. Nevertheless, there appears to be no strict connection 
between the two techniques. Recently, however, ElROLA/NEVANLINNA [6] have 
established a connection between GMRES and a certain rank-1 update based 
on a nonstandard secant condition (cf. Remark 1 in Section 2.1). 

The paper is organized as follows. In Section 2.1, we introduce a general fam- 
ily of secant methods. In Sections 2.2 and 2.3, special rank-1 updates and line 
search principles, respectively, are discussed. In Sections 3.1 and 3.2, we present 
convergence results for secant methods based on Broyden’s bad and good up- 
dates. These results are then illustrated for a linear system arising from a 
simple 1-D boundary value problem in Section 3.3. Next, we discuss actual 
implementations of the proposed secant methods in Section 4. Typical numer- 
ical experiments axe reported in Section 5. Finally, we make some concluding 
remarks. 

Throughout this paper, all vectors and matrices are assumed to be complex. 
As usual, M* = (TTTjty) denotes the conjugate transpose of the matrix M = 
(mj k ). The vector norm ||x|| = y/x m x is always the Euclidean norm and [|M|| = 
sup|| x || =1 ||Afx|| the corresponding matrix norm. Occasionally, the Frobenius 
norm ||M||f = (T,j,k |”ijfc| 2 ) 1/2 ^11 be used. 
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2. A Family of Secant Methods 


The paper deals with the solution of linear systems (1.1) where A is a non- 
Hermitian nxn matrix and b € C". From now on, it is always assumed that A 
is nonsingular, and x := A~ l b denotes the exact solution of (1.1). 

The methods studied in this paper are iterative schemes. For any given starting 
vector x 0 € C”, a sequence of approximations x*, k = 1,2, . . ., to x is computed. 
Furthermore, in each step an n x n matrix H k which approximates A- 1 is 
generated. Here Ho is a given nonsingular initial approximation of A -1 . 

In the sequel, 

e* := x — x \ t and r k := b — Ax* 

always denote the error vector and residual vector , respectively, corresponding 
to the iterate x*. Moreover, 

E k := I — H k A 

is the error matrix associated with the “preconditioning” matrix H k and 


A* := H k r k 


is the “preconditioned” residual vector. Finally, for nonsingular H k , we denote 
by 

B k := H k l 


the approximations of A. 


2.1 The General Algorithm 

The approximation ffjt+i of A is obtained from the one of the previous iteration, 
H k , by adding a rank-1 correction. In conjunction with the requirement that 
the following secant condition (or quasi-Newton condition ) 

Hk+iAA k = A* (2.1) 

holds, this leads (see e.g. [3, Chapter 8]) to the general update 

+ ( 2 - 2 ) 

due to BROYDEN [1]. Here, v k 6 C is any vector such that vlH k AA k ± 0. By 
applying the Sherman-Morrison formula to (2.2), one readily verifies that H k + 1 
is nonsingular with inverse 

R+i = B t + (A - , (2-3J 

v k A k 

as long as H k is nonsingular and u£A* ^ 0. 
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Remark 1. ElROLA/NEVANLINNA [6] study secant methods which are based 
on the “conjugate transposed” secant condition 

Hl +1 A*c k = c k (2.4) 

instead of (2.1). For the special choice c k = AA k in (2.4), the resulting algorithm 
([6], see also [14]) is mathematically equivalent to GMRES. 

In each iteration, the new approximation x k+ i to x is obtained by correcting 
the previous iterate x k along the preconditioned residual A k . In combination 
with the update (2.2), this leads to the following informal algorithm. 

Algorithm 2.1 


Start: a) ro := b— Ax o 

Iteration loop: k = 0, 1, . . . : 

b) A* := H k r k 
q k := AA k 
z k := H k q k 

c) x k+i := x k + t k A* 
nt+i : = r k - t k q k 

Update: 

d) H w := Ht + (A* - z t )^A 


Notice that Algorithm 2.1 describes a whole family of secant methods which 
still depend on the choices of v k in the update d) and the step length t k in c). 
Strategies for the selection of these parameters will be discussed in Sections 2.2 
and 2.3. 

In the following lemma, we collect some simple recursions that are valid for all 
choices of v k and t k . Here and in the sequel, the notations 


Tk . = 

* ' Vk 2 k ’ 


Zi = zj fc) := HiAA k , and 7 ,• := 


VjZj 

VjZi 


(2.5) 


are used. 
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Lemma 2.2 Let v*z, ^ 0, i = 0, . . . , k — 1. Then: 

a) e*+i = ((1 — tk)I + t k E k ) e k , 

b) A k+i = (l - t k + r k )A k — r k z k ) = ((1 — t k )I '+ r k E k ) A k , (2.6) 

c) Zi+i = % + — ( A <+1 - (1 - <<)A,) , i = 0, . . . , k - 1 . 

Proof. Note that e fc+1 = e k - t k A* and A k = (I - E k )e k . Combining these 
two identities yields (2.6a). 

Next, one easily verifies that 

Afc+i = H k +ir k+i = (1 — t k )A k + r k (A k — z k ) . (2-7) 

Since E k A k = A k — z k , (2.7) immediately leads to (2.6b). 

By using (2.5) and the update formula which connects Hi+\ and Hi, one obtains 

?i+i = Zi + 7i(A,- - Zi) . (2.8) 

Finally, by rewriting the term A,- — z, in (2.8) by means of the first identity 
(with ifc = t) in (2.6b), one arrives at (2.6c). ■ 

2.2 Special Rank— 1 Updates 

First, note that, by (2.2), the error matrix associated with the preconditioner 
H k satisfies the update formula 

=eJi- . (2.9) 

V v t z k ) 

Clearly, one would like to improve the preconditioner from step to step. Thus, 
v k in (2.9) should be chosen such that a suitable norm of E k is decreasing. In 
this section, three special choices of v k are discussed. 

(A) The first one is the so-called Broyden’s “good” update [1], Here, in each 
iteration, one sets 

v k := A k . (2.10) 

Assume that H k is nonsingular and that x k ^ x, which implies A k ^ 0. 
With (2.10), (2.9) can be rewritten as 

E k +i = E k P k where P k := • ( 2 - n ) 
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Remark that, except for the trivial case HkA = I, P k in (2.11) is an 
oblique, non-orthogonal projection. Thus, one cannot guarantee that 
PM is decreasing. However, for the different error matrix 

E k :=I-A~ l B k , 


one obtains such a reduction property: 

Ek + 1 = E k Qk where Q k := • ( 2 - 12 ) 

Now, Qk is an orthogonal projection. Consequently, (2.12) guarantees an 
improvement of the preconditioner in each step, in the sense that 

||&ti| < u&i ( 2 - 13 > 


and 

(2- 14 ) 

Obviously, in view of (2.2) and (2.10), Broyden’s good update is only 
defined as long as 

AlHkAAk + 0 (2.15) 

which (cf. (2.3)) guarantees that with H k also H k + 1 is nonsingular. 

In particular, the more restrictive condition 

AlHkAAk > 0 (2.16) 


certainly implies (2.15). Clearly, (2.16) can be rewritten as 

A\EkAk 


e k ■= 


A*A* 


< 1 . 


Since e* < ||£fc||, a sufficient condition for (2.17) is 

m < i . 


(2.17) 


(2.18) 


Now, it is easily verified that E k and E k are connected by 


E k = -(I-Ek)- 1 E k , 


and it follows that 


P*ll < 


11M 

1 - \\E k \\ ' 


(2.19) 
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By (2.19), the condition 


ll&ll < \ (2-20) 

implies (2.18). If (2.20) is satisfied for k = 0, then (2.13) guarantees that 
(2.20) holds for all k. Finally, by (2.18), H k A and, since A is assumed to 
be nonsingular, H k is nonsingular. 

Therefore, we have proved the following 

Lemma 2.3 Let Ho be a nonsingular n x n matrix such that ||I?o|| < 5 . 
Then, Broyden’s good update (2.2), with v k chosen as in (2.10), is well 
defined as long as x k ^ x. 

(B) The so-called Broyden’s “bad” update [1] is obtained by choosing v k in 
(2.2) such that 

Hlv k = AA k = q k 
holds. Then, (2.9) reduces to 


E k+ i = E k (l-^A) . 

V 9*9* / 


( 2 . 21 ) 


Remark that Broyden’s bad update is well defined as long as A k ^ 0. In 
particular, no additional restrictions for H 0 are needed. 

For the special error matrix 


:= AE k A~^ — I — AH k , 
(2.21) leads to the update formula 

\ 9 k Qk/ 


( 2 . 22 ) 


(2.23) 


From (2.23), it follows that H k + 1 is an improved preconditioner, in the 
sense that 

(2.24) 

(2.25) 


and 


ll^+ill < ll&ll 

II-£*+i||f = II&IIf - ■ 
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(C) A third obvious choice for v k in (2.2) is 


v k := z k . 

The corresponding update (2.9) for the error matrix is 

E w = eJi- ■ ( 2.26) 

Here, one needs to ensure z k ^ 0. Obviously, this is guaranteed if H k is 
nonsingular and x k ^ x. If H k is nonsingular, then (2.26) can be rewritten 
in terms of an orthogonal projection as follows: 

E M = E„(H k AY' ( 7 -^W/t. ( 2 . 27 ) 

V Z k Z kJ 

However, unlike as for updates (A) and (B), (2.27) does not imply a reduc- 
tion property of some “natural” measure for the preconditioner H k . This 
suggests that this type of update is not competitive with Broyden’s good 
and bad ones. Indeed, this was confirmed by our numerical experiments. 


2.3 Line Search Principles 

In this section, the selection of the step length t k in part c) of Algorithm 2.1 is 
discussed. Ideally, one would like to choose t k such that 

||e*+.(*.)ll = mi® lk*+i(<)!l (2-28) 

<eC 

where 

e*+i(t) ‘ = Cfc t& k . 

Unfortunately, since x and hence e k is unavailable, the step length defined by 
(2.28) can not be computed. However, in view of 

Cfc+i(t) = A _1 r t+ i(<) where r k+1 (i) := r k - tq k , (2.29) 

(2.28) can be satisfied at least approximately by choosing t k such that 

||C*r* + i(**)|| = min ||C fc r* +1 (<)|| . (2.30) 

. «eC 

Here C k is some approximate inverse of A. At iteration k of Algorithm 2.1, 
there are three natural choices lor C kt namely H k +i, H k , or simply C* = J, 
which lead to the line search principles (a), (c), or (b), respectively. Next, these 
three strategies are discussed. 
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(a) With C k = H k +i and A fc +i = H k+1 r k+u (2.30) reads as follows: 

||Afc + i(<jt)|| = min||Ajfe + i(<)|| . 


(2.31) 


Using (2.6b) and the second relation in (2.29), one readily verifies that 
(2.31) is equivalent to 

AfcAfc+i = 0 where Ajt+i = (1 + t*)A* — r k z k — t k At . (2.32) 

Recall that T k was defined in (2.5) and note that r k still depends on the 
particular choice of the rank-1 update (2.2). 

Finally, from (2.32), it follows that the step length for the line search 
principle (2.31) is given by 


~ A!.?* 

t k = t k :=l+T k - 


(2.33) 


Note that for the special case, v k = A k , of Broyden’s good update , (2.33) 

leads to ... 

r _ _ AJA fc 

tk = T k = 


A * k Zk 


(2.34) 


(b) For C k = /, (2.30) reduces to 


||r* + i(t*)|| = min||r fc+1 (<)|| 

or, equivalently, 

q k r k + 1 = 0 where r k+ 1 = r k — t k q k . 
Hence, by (2.36), the minimization principle (2.35) leads to 

tk = t k :=2tJL. 

q k qk 

Remark that for Broyden’s bad update , (B), one has 

tk = T k . 


(2.35) 


(2.36) 


(2.37) 


(2.38) 


(c) With C k = H k , (2.30) specifies to 


||#*r* +1 (i fc )ll = roia||H*rfc + i(t)|| 

ieC 


(2.39) 
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By rewriting (2.39) in the form 

z* k HkTk + 1 = 0 where H k r k+X — At — t k z k , 

it follows that 

= (2.40) 

4 z k 

Here, for update (C), one has 

t° k = r k . (2.41) 

Notice that, in view of (2.34), (2.38), and (2.41), the choice t k = T k for the step 
length leads to a natural coupling of the three special rank-1 updates (A), (B), 
and (C) with the line search principles (a), (b), and (c), respectively. 

More general, for t k = r k , the following properties hold. 

Lemma 2.4 In Algorithm 2.1, let t k = T k be chosen and assume that v k z k ^ 0. 
Then: 

a) The iterate x k+x is uniquely defined by the Galerkin type condition 

H k r k+X ± v k and x k + x € x* + span{A*} , (2.42) 

b) H k+X r k+X = H k r k+X . 

Proof. By the second condition in (2.42), x*+i = x k + tA k and thus 

H k r k+1 = A* - tz k 

for some t G C. Together with the definition of r k in (2.5), it follows that 

vtA k 

v k H k r k+x = WfcAi — tv k z k = 0 ^ t = = T k , 

v k z k 

and this concludes the proof of a). 

Next, one easily verifies that 

H k r k+1 = (1 - t k ) A fc + t k E k A k . (2.43) 

By comparing (2.6b) and (2.43), one obtains the relation stated in b). ■ 

Remark that the classical step length used in combination with Broyden’s up- 
date (2.2) is t k = 1. Somewhat surprisingly, this choice guarantees that the 
resulting method — at least in theory — terminates after at most 2n steps with 
the exact solution of (1.1), as was shown by Gay [9] (cf. also [10]). Obviously, 
this finite termination property is not of practical importance for large sparse 
ljt )par systems. Here, we take another look at the choice t k = 1 . 
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Lemma 2.5 In Algorithm 2.1, assume that v k Zk ^ 0. Let x k and xt +2 be 
the iterates generated by two successive steps of Algorithm 2.1 with step length 
tk = tk + i = 1. Then: 


Xk+2 = 2 *+ i + H k r k + i , r k +2 — (T - AH k )r k +i , ( 2 - 44 ) 

where 

Xk+l = Xk *f X k Ak ) ^*fc+l — XkAHk^rk , (2.45) 

and 

v;a> 

Proof. Since = <*+1 = 1, we have 

x* + 2 — Xfc + At + Ajt+i • (2-46) 

For tfc = 1, the first identity in (2.6b) reduces to A*+i = t*(J — HkA)Ak and, 
thus, (2.46) can be rewritten as 

Xk + 2 = Xk + r k A k + Hk(I — TkAHk)rk ■ (2-47) 

Now, (2.44) and (2.45) readily follow from (2.47). ■ 

Note that, in view of part a) of Lemma 2.4, the intermediate quantity x*+i, 
(2.45), is just the Galerkin iterate in the sense of (2.42). 

Therefore, Lemma 2.5 shows that, by combining two successive steps, Algo- 
rithm 2.1 with tk = 1 can be interpreted as follows. At the beginning of step k, 
the approximate solution x* and the preconditioner Hk are available. From these 
quantities, the iterate x * + 2 of step k ■ f 2 is obtained by applying one Galerkin 
step , namely (2.45), followed by one step of Richardson iteration , namely (2.44), 
to the preconditioned linear system 

HkAx — Hkb . 

In general, the “virtual” iterate Xk+i and the actual iterate x k+l are different. 
Note that (2.44) is a Richardson step without line search. In particular, if H k 
— as is to be expected in the early stage of the iteration — is not yet a good 
approximation to A -1 , then (2.44) will lead to an increase rather than a decrease 
of j|r*; + 2 1| - In order to prevent such undesirable effects, it appears preferable to 
combine Broyden’s update with the line search principles (a), (b), or (c), instead 
of using tk = I- 
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3. Convergence Analysis 

In principle, Algorithm 2.1 could be implemented with any of the 9 combinations 
(Aa), . . (Cc) of rank-1 updates (A), (B), and (C) with line search strategies 
(a), (b), and (c). As already mentioned in Section 2.2, the update (C) is not 
competitive with (A) and (B), and, therefore, (C) is dropped here. Among 
the remaining 6 combinations, only the pairs (Aa), (Bb), and (Ac) will be 
considered. 

As a first step, the following auxiliary result for the case of the line search 
principle (b) is established. 

Lemma 3.1 In the general Algorithm 2.1, let the step length (2.37), tk = tk, 
be chosen. Then, 

< < m , (3.i) 

with Ek = I — AHk defined as in (2.22). 



Proof. From part c) of Algorithm 2.1 and (2.37), one obtains 


~ Qk r k 

r*+i = r k - tkqk — r k ; — Qk = 

9k<Jk 


V \ qtQkJ 


( r k - qk) • 


Since 

rk — qk = Ekrk , (3.2) 

it follows that 

||r».n = II (/ - %£) &r,n < |&r»|| < llftll . Ilr.ll , 

and thus (3.1) holds. ■ 

Recall from Section 2.2 that the error matrix E k is closely connected with 
Broyden’s bad update (B), cf. (2.23)-(2.25). In the following section, Lemma 
3.1 will be used to obtain a convergence result for update (B). 


3.1 Broyden’s Bad Update 

Theorem 3.2 (B-update) 

Consider Algorithm 2.1 with update (B) and step length tk = tk = rk, (2.37), or 
tk = 1. Assume that 

||£o|| < So < 1 . 
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Then, the iteration converges globally satisfying 

11^,11 Ikt+ill „ ll& r tll - T 

1^.11 " INI - INI - ° 


and 


r k > 0 for r k ± 0 . 

Moreover, if r k ^ 0 for all k = 0,1,..., then, 


lim t k = 1 , 

k—*oo 


and the convergence is superlinear in the sense that 

lim = 0 • 


(3.3) 

(3.4) 

(3.5) 


(3.6) 


Proof. First, global convergence is shown for it = t*. By Lemma 3.1, 

IK+lll < ll& r *,l! < iijg || . (3.7) 

INI " INI ” 11 11 

In view of (2.24), (3.7) implies 

< ll&ll < ll&ll < 5. < 1 • (3.8) 

By combining (3.7) and (3.8), the statement in (3.3) follows. By (2.37), (2.38), 
and (3.2), the step length t k = r k satisfies 


_ INI 2 
‘ “ " ilftll 2 ll«4 2 ' 


(3.9) 


Using (3.7), one deduces from (3.9) that 


T k > 


W! 

INI 2 


NjkN > 
INI 2 - 


(l - ll&ll) 


W! 

INI 2 


>0 for 


5 


and (3.4) holds true. 

Next, based on the Frobenius norm property (2.25), superlinear convergence 
is shown. Following the proof technique of BROYDEN /DENNIS/MORI-; [2], one 


obtains 


lim 

k—*<x> 


ll&NI 

INI 


= 0 . 


(3.10) 
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By (3.8), the assumption ?o < 1 guarantees that the matrix (I — E k ) is nonsin- 
gular. Thus the relation (3.2) can be rewritten as 

r » = (/-£*)''* • ( 3 -H) 


Using (3.11) and (3.8), one readily verifies that 

H&r4 _ 1|&(J- flQ-VH < 1 + |l^ fc |l \\Bkqk\\ ^ 1 + 5) JLgfcgfcjl 
INI \\(I - £ k )-'q k \\ -1-||^||' M ~l-5> Ikfcll 

Therefore, by means of (3.7) and (3.10), one concludes that 


lim 

it— ► oo 


Jhiil 

INI 


< i±* lim Jl^ll 
i - s 0 k ~°° M 


= 0, 


which immediately yields (3.6). Similarly, from 


qt r k _ gt(J ~ 
QkQk 


one deduces that 


\ik - 1| < 


1 

l-$o 


Ml 


Therefore, (3.10) implies 

lim 1 1* - 1| = 0 

K— MX> 


which confirms (3.5). 

For the case t k — 1, the relation (2.6a) reduces to 


e*+ 1 = E k e k 


which is equivalent to 

rit+i = E k r k . 

Now (3.12), also yields (3.7). The rest of the proof can just be copied. 


(3.12) 


3.2 Broyden’s Good Update 

Theorem 3.3 (A-update) 

Consider Algorithm 2.1 with update (A) and line search either (a) or (c). As- 
sume that 

||fo|| < So < § . (3.13) 
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Then, the iteration converges globally satisfying 

ll&frll + JllfcM 


.KmJI < jlffcjl HAtll < < i 




1 - 


\ \EkA4 

l|A*|| 


l — ^o 


(3.14) 


and, 


Tic > 0 for e*; ^ 0 . (3.15) 

Moreover, if e k ± 0 /or all k = 0,1,. . then the convergence is superlinear with 


and 


lim ^7^ = 0 

*^°° INI 


lim t/c = 1 - 

k—>oc 


(3.16) 

(3.17) 


Proof. First, line search (a) with t k = T k (cf. (2.34)) is considered. Rewriting 
(2.6a) in terms of E k yields 


By means of the relations 

1 - Tfc = 1 - 

and 

one obtains from (3.18) 


ek+i = (1 -r k )A k — E k A k . 

AjJAt Aj(z fc - A fc ) 


(3.18) 


A k z k A \z k 

A ic = (I - E k )z k , 


A k E k z k = 

e ‘ +i = — • A ‘ ~ £lA ‘ • 

Moreover, using the formula (3.19) once more, one easily verifies that 
|AJ.Ejtz*| | (zjk — EjcZkYEkZkl 


(3.19) 


(3.20) 


|A*z*| |(z fc - E k z k ) m z k \ 

L ll%*ll 2 


Ij^EkZk | 

INI 2 


zjE k z k 


1 - 


ZjEkzk 

IMP 


_ 1 — £ k 

< € k ' ~ = £k , 

1 ~ £k 
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where 


£i ■ irar - l|£l11 - 40 ■ 

With these inequalities, (3.20) leads to the estimates 

4 ”' 


(3.21) 


Finally, with 
one obtains 


e/t = (I + Ek)Ak , 


l e *l 


/ n&AtiA 

2 V l|A*|| / 


I|a,||>(i-4)I|a 4 || 


(3.22) 


By comb ining (3.21) and (3.22), one ends up with (3.14). 

Similarly, one shows 

T ‘ =1 — apt- 1 -' 1 ’ 

which certainly confirms the assertion (3.15). 

In order to prove superlinear convergence, first remark that the Frobenius norm 
result (2.14) implies 

ii if. a . ii 

(3.23) 


lim = 0 


||A fc || 

Along lines similar as in the proof of Theorem 3.2, one then verifies that 


lim ik = 0 . 

fc— ► oo 


(3.24) 


Now, by using (3.23), (3.24), and the estimates in (3.14) of this theorem, one 
obtains (3.16) and, with 

\h ~ 1 | < ?k , 

also (3.17). This completes the proof for line search (a). 

Next, consider the line search principle (c) where, by (2.40), 


t k = tl = 


z\Zk 


As before, one starts with 


e*+i = (1 — f*)A* - EkAk 
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which now leads to 


e fc+i = 


fggfcffc 

z\z k 


Afc — Ek&k • 


From this, one derives the estimate 


IK.ll 

IIA.1I - £ ‘ + HA.II ’ 

which is the same as (3.21). The rest of the proof can essentially be copied. ■ 

For the choice t k = 1, Broyden’s classical good method is obtained. The con- 
vergence behavior of this algorithm is studied in BROYDEN /Dennis/Mor£ 
[2]. In this case, the assumption (3.13) can be relaxed to 6o < 3- As already 
mentioned, the choice t k = 1 guarantees that Broyden’s good method stops 
after at most 2n steps. A slight modification, the so-called projected Broyden’s 
method, even terminates after at most n iterations. This algorithm is analyzed 
in Gay/Schnabel [8]. 


Conjecture. The authors were unable to get rid of the factor 2 in (3.14). If 
this factor drops, then only 6 0 < \ would be required — which seems to be 
more reasonable in view of (2.20). 


3.3 An Illustrative Example 

In this section, we discuss a simple illustrative example, namely a convection- 
diffusion problem in 1-D. Consider the ODE boundary value problem 


a) — u" + /?u' = 0 on (0, 1) , 

b) u(0) = 1 , u(l) = 0 . 


(3.25) 


By using upwind discretization on a uniform grid with step size h = 1/n, (3.25) 
leads to a linear system Ax = 6 with the diagonally dominant tridiagonal matrix 


2 + 0h -1 

-(1 + 0h) ••• 

-1 

-(l + 0/i) 2 + fih 


(3.26) 


Let n = 50 and set b = (1,0 , .. . ,0) T . Moreover, choose x 0 as the prolongation 
obtained from the exact solution on the coarser grid h = 1/25. For H 0 , we 
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chose simple diagonal preconditioning as in (5.1). In this case, one is able 
to compute all quantities of interest directly and to compare the convergence 
theory of Sections 3.1 and 3.2 with the actual behavior of the algorithms — see 
Table 3.1. 



\\Eo\\ 

H&il 

II^Voll/IMI 

l|£o*o||/IMI 

ll&Aoli/llAoll 

lO 

II 

415 

0.99 

0.53 

0.43 

2.72 

/? = 100 

64 

0.99 

0.37 

0.28 

0.24 


Table 3.1: Quantities used in convergence theory of Sections 3.1 
and 3.2 for Example (3.25). 


These results seem to justify the relaxation of the rather restrictive convergence 
criteria in Sections 3.1 and 3.2 — compare (2.16) and (2.17) in the light of 
(2.18), (2.20), and (3.13). 

In Fig. 3.1 and Fig. 3.2, the convergence history of 3 codes (see Section 5 for a 
description of these codes) is compared — both in terms of the residual norms 
||r*|| and the error norms ||e*||. rTT“““v'- : 

In this example, both GMRES and the “bad Broyden” code BB suc cessively 
reduce the residual norm, whereas the “good Broyden” code GB reduces the 
error norm — a property that has been shown to hold at least asymptotically 
without a storage restriction: just compare the minimization property (2.31), 
|| Ajt-t-i j] = min, for t* = t* with the asymptotic property (see (4.2) below) 
||Ajt + ijj = || ejt-t-i || for n = 1. As an illustration, Fig. 3.3 gives a comparison of 
the true and estimated errors. 
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Residual 



Figure 3.1: Comparative residual norms ||r*|| 2 for 3 iterative solvers for Ex- 
ample (3.25). 


Error 



Figure 3.2: Comparative error norms ||ejt ||2 f° r 3 iterative solvers with 

ibnuuc = 10 for Example (3.25). 
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2 4 6 8 10 12 14 16 18 20 

Figure 3.3: Iterative behavior of true errors ||e*|| and estimated errors |[A/t|| 
for GB(3) in Example (3.25). 


22 




4. Details of Realization 

The secant methods based on Algorithm 2.1 with update either (A) or (B) are, 
of course, implemented in a storage saving compact form. 

Algorithm (A): “Good Broyden” 

Start: a) r 0 := b — Ax o 

Ao := H 0 r 0 
<7 0 := AJA 0 

Iteration loop: k = 0, 1, ... : 

b) q k := AA k 
z 0 := H 0 q k 

Update loop: t = 0, . . . , k — 1 (for A > 1) 

c) •?.•+! := Zi + ^(A i+1 - (1 - t.)A.) 

7« T i 

d) z k := z k 
7* := A k z k 
r k := <T k hk 

t k := r fc or t k := t k = qtr k /qlq k or := 1 
xjb+i := x* + t k Afc 
rfc+i := nt — 

Afc+i := (1 - tk + Tfc)A* - r k z k 
0k + i := AJ +1 Afc + i 

Array storage. The above implementation requires to store (up to iteration 
step k) the vectors 

Ao, • • • , Ajt, q,z = z , 

for step length t k = r* or t k = 1 and, in addition, r = r k in the case of step 
length t k = t k , which sum up to 

(k + 2)n or (k + 3)n (4.1) 


storage places. 
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Operation count. Per iteration step k one needs 1 matrix-vector multipli- 
cation, 1 solution of a preconditioned system of the form BqZ = q in order 
to obtain z 0 in b), and (2k -f 7 )n multiplications. Obviously, the inner loop 
vectorizes. 

Termination criteria. Under the assumptions of Theorem 3.3 (cf. (3.22)), 

which means that, at least asymptotically, || A*|| is a reasonable computationally 
available estimate for ||ejt]| to be written as 

IIA.II = IM . (4.2) 

This motivates the convergence criterion 

< e (4-3) 

where e is some relative accuracy parameter to be specified by the user. 

In order to ensure that Hk+iA is nonsingular, recall condition (2.17), which 
reads e k < 1 in the notation of Section 2.2. By replacing this condition by the 
stricter one 

e* < 1 , 

Tm&x 

we arrive at a restart condition 



Tfc ^ 0 Or Tjfc > Tmax 

which can be easily monitored. Here, T max is some internal parameter, and we 
have chosen w = 10 in all the numerical experiments described in this paper. 

Remark 1 . The convergence criterion (4.3) nicely agrees with requirements 
needed in the global inexact Newton algorithm for nonlinear problems as given 
by DEUFLHARD [4]. 

Remark 2. Clearly, the Euclidean inner product in Algorithm (A) can be 
replaced by any other inner product (*,*) — possibly scaled and certainly de- 
pending on the problem to be solved. 
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Algorithm (B): “Bad Broyden” 


Start: a) r 0 := b — Ax 0 


A 0 := H 0 r 0 


Iteration loop: k = 0, 1, . . . : 

b) q k := A A* 

Zq := Hoq k 

Update loop: i — 0, . . . , k — 1 (for k > 1) 

c) z,+ 1 := *i + -|^(A;+i — ^ ~ **)A«) 

d) z k := z k 


Pk := 



Zfc+1 x k + tk&k 
Tk+ 1 := r k - t k q k 
Afc+i : = A k t k z k 


The version for t k = 1 was ignored for obvious reasons. 


Array Storage. The implementation of this algorithm requires to store (up 
to iteration step k) the vectors 

Aqi •• • i Afc) qo, • • ■ , qki z 1 


which sums up to 


(2k + 2 )n 


storage places — to be compared with (4.1). 


Operation count. Per iterative step k one needs 1 matrix-vector multipli- 
cation, again solution of 1 linear preconditioned system with B 0 as coefficient 
matrix, and (2Jt -I- 8)n multiplications. Once more, the inner loop easily vector- 
izes. 


Termination criteria. Since update (B) is closely connected with minimiza- 
tion principle (2.35), the convergence criterion for Algorithm (B) will be based 


25 


on the residual norm. In view of the property (for t k = r k ) 


r fc+l r fc+l — r k r k tkQkQk > 

Algorithm (B) is stopped as soon as 

llnt+ill <*IM 

is reached. Again, e is to be specified by the user. Moreover, the iteration is 
restarted , if 

M ■ Ikll < e|MI . 

Note that in view of Theorem 3.2, the iteration would need to be just termi- 
nated, if 

ll& r *ll _ Ik* - r 4 ^ , 

INI INI 

which can be shown to be equivalent to the condition 

tk = T k < 5 . 


Restricted storage versions 

For large n, one needs to restrict storage to some m • n such that 

km*, + 2 = m for ( Aa ) , 

2k max + 2 = m for ( Bb ) . 

Several options are possible to satisfy this restriction. 

(I) Both Algorithms (A) and (B) can be just restarted after fcmax iterations 
using as the new starting guess x 0 . Under the assumptions of the 

convergence theorems in Sections 3.1 and 3.2, these restricted variants can 
be shown to converge linearly. 

(II) Both (A) and (B) can be modified by restricting the update loop to indices 

i = 0 ,..., — 1 (initial window) . 

j 

This means a fixed preconditioning of the problem associated with H k 

— with preconditioning from the right in (B) and from the left in (A). 

Again, linear convergence can be shown under the assumptions made in 
Sections 3.1 and 3.2. 
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(Ill) Once k> k^ is reached, one may also consider restricting the update 
loop to indices 


i = k— km **, ... ,k (moving window) . 


For update (A), such a variant seems to be hard to interpret. For update 
(B), however, the update loop c) in Algorithm (B) can be solved to yield 


a) 


jt-i 


z k = H 0 AA k + £ 7 * • (A i+1 - (1 - ti)Ai) 
»=0 


with factors 


b) ntik == 


g iQk 

Uq'q, ' 


Note that the corresponding factors 7 ,- * in Algorithm (A) would contain 
Obviously, the moving window variant in Algorithm (B) means replacing the 
above sum by its most recent iterative contributions. Such a variant might seem 
reasonable in view of the super linear convergence properties of secant methods. 
However, it is unclear whether such a variant converges at all. 

Each of the above restricted storage versions was implemented and tested on 
several e xam ples. It turns out that all the window variants are not competitive 
with variant (/). Therefore, only (J) will be studied in Section 5. 
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5. Numerical Experiments 

On the basis of the above derivation, the following storage restricted algorithms 
are compared here: 

GB(fc mAX ): Update (A) with line search (a), Broyden’s “good” 

method, restricted storage version (I). 

BBfL..): Update (B) with line search (6), Broyden’s “bad” 

method, restricted storage version (I). 

GMRES-L(Jfc max ): Program GMRES(fc) [13] with left preconditioning. 

GMRES-R(fc max ): As above, but with the usual right preconditioning. 

Any other variants of GB or BB are not included here, since their performance 
was not competitive with the two versions above. This excludes both window 
variants (II) and (III) of Section 4 and the different line searches tk ^ t* for 
GB. The distinction of left and right preconditioning for GMRES has been 
made deliberately, since GB may be understood as some successively refined 
left preconditioner, whereas BB may be interpreted as some successively refined 
right preconditioner — which can be seen in the matrices Ek for GB and &k for 
BB. 

Recall from Section 4 that BB(fc nuut ) requires about twice the array storage as 
the other 3 codes. Moreover, the GB code and the GMRES codes supply the 
residual vector only, if explicitly wanted. If the successive iterates x* are explic- 
itly wanted (say, within an adaptive code or a nonlinear code [4]), then both 
GMRES codes need some modification, which in GMRES-R includes an addi- 
tional preconditioned system solve per each iteration. Throughout the present 
section, only the rather simple preconditioning 

H 0 D , D : = diag(axi, .. . ,a„n) , (^d) 

is chosen. In a PDE context, this preconditioner takes care of the elliptic part 
(cf. [5]) — the rest must be taken care of by the rank-1 updates. A detailed 
study of different preconditioning techniques in a PDE setting will be given 
elsewhere. 

Our test examples arise from convection-diffusion problems in 2-D of the fol- 
lowing type: 

a) — eAu + f} • Vu = / on fi C R 2 , 

r\ 

b) u|r„ = tto , jr- =o on, 5n = rouri, ronri=0. 

dn Fi 

In order to solve this problem, streamline upwind discretization with anisotropic 
adaptive grid refinement due to KoRNHUBER/RoiTZSCH [12]) is used. 
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Example 1. Circular layer problem 

As a first special case of (5.1), we study a problem with a circular layer. For 
this, we set e = 10 -5 , / = 0 and /? = (y, -x). The domain U is (0, l) x (0, l)\r 0 
with r 0 = {(x,y) : x = 0.5, y < 0.5}. On the inflow boundary, we prescribe 




0 if y > 0.3 

1 if y < 0.3 


(x,y) € To • 


In Fig. 5.1, the underlying grid with n = 4238 is shown. Starting point xo is 
the interpolated solution on a coarser grid. 



Figure 5.1: Anisotropic grid for Example 1, due to [12], 

With only diagonal preconditioning, the BB code fails to solve the problem 
within n steps (nearly constant residual norm throughout the iteration). First, 
the behavior of GMRES with fcmu — 10 has been studied (Fig. 5.2 and 5.3), 
which led to the selection of GMRES-R(IO) as best version. This version has 
been compared with GB(10) — see Fig. 5.4. To measure the error norms, the 
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final iterate of a GB(10) run with required relative accuracy e = 10 -8 in (4.3) has 
been taken as an estimate of the exact solution. Unlike the illustrative example 
in Section 3.3, the estimated error in GB(10) behaves only qualitatively as the 
true error — compare Fig. 5.5. Asymptotically, true and estimated error exhibit 
the same behavior, apart from oscillations caused by the A^c-restriction. 


Error in GMRES(k) 



Figure 5.2: Comparison of 3 GMRES versions with right preconditioning in 
Example 1. 
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Figure 5.3: Comparison of left and right preconditioning in Example 1 



Figure 5.4: Comparison of error for GMRES and GB in Example 1. 
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0 50 100 150 200 250 300 350 

Figure 5.5: Comparison of true and estimated error in GB(10) in Example 1. 
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Example 2. Straight interior layer problem 

The second test case was the convection-diffusion equation (5.1) on 0 = (0, 1) x 
(0, 1) with a straight interior layer. To obtain this, we set e = 10 -6 , / = 0 and 
f) = (1.0; 0.5). The inflow boundary To is given by To = {(^M/) € dtt : 
max(x, y) < 1}. We prescribe the boundary condition 


«o(z,y) 


0 if y > 0.3 

1 if y < 0.3 


(x,y) € T 0 . 


In Fig. 5.6, the final grid with n = 2874 is shown. 

The behavior of the true error with diagonal preconditioning during the iteration 
is shown in Fig. 5.7. Once more, as in Example 1, GB appears to be the best 
solver. Note that the behavior in case k ^ = 5 is typical also for other choices 
of fcnuix • 



Figure 5.6: Anisotropic grid for Example 2, due to [12]. 
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Enor 



Figure 5.7: Comparison of error in GB(5), GMRES(5) and BB(5) in Exam- 
ple 2. 
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Conclusion 


Two variants of secant methods based on Broy den’s “good” and “bad” rank-1 
updates have been studied. It turned out to be important that each update 
technique is combined with its associated line search. In comparison with GM- 
RES, the up to now bad reputation of secant methods for linear problems is 
certainly not justified, if a reasonable preconditioning is at hand. Especially, 
the “good” Broyden variant appeared to be the more competitive, the larger 
the system dimension was. This observation is backed not only by the given 
examples, but also by further more extensive tests. In the context of multilevel 
discretizations of PDEs, the derived secant methods seem to have the struc- 
tural advantage that the arising inner products can be especially adapted to 
the underlying PDE problem. 
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